Automatic Assessment of Absolute Sentence Complexity
Authors
Abstract
Lexically and syntactically simpler sentences lead to shorter reading times and better understanding for many readers. However, no reliable systems for automatic assessment of sentence complexity have been proposed so far. Instead, the assessment is usually done manually, which requires expert human annotators. To address this problem, we first define sentence complexity assessment as a five-level classification task and build a ‘gold standard’ dataset. Next, we propose robust systems for sentence complexity assessment that use a novel set of features leveraging lexical properties of freely available corpora, and we investigate the impact of feature type and corpus size on classification performance.
Similar resources
Syntactic Complexity of Russian Unified State Exam Texts in English: A Study on Reliability and Validity
In this study we analyze texts used in the Russian Unified State Exam (USE) in English. The texts, which form two small research corpora, were retrieved from two sources: the official USE database, used as a reference point, and “Neznaika” (https://neznaika.pro/), a popular website that pupils use for USE training. The two corpora are balanced in size: USE has 11934 tokens and “Neznaika” has 11918 tokens. We share Biber’...
Automatic Sentence Ordering Assessment Based on Similarity
Sentence ordering is one of the tasks of text generation, since it is crucial for readability. Nevertheless, there is no common approach to evaluating sentence ordering. The state-of-the-art methods are based on comparison with a human-provided order. However, in many cases obtaining such an order is impossible or time- and resource-consuming. Therefore, we propose three completely automatic approaches for sent...
A Study of Speech Quality Indices in Normal 4- to 5-Year-Old Persian-Speaking Children in the Cities of Semnan, Birjand, and Tonekabon, 1383 SH (2004-05)
Background and purpose: A person's language abilities can be examined through five parameters of speech quality: speech fluency, speech complexity, speech exactness, speech rate, and lexical accessibility. These parameters are examined through secondary parameters including mean length of utterance (MLU), mean length of the five longest utterances, mean number of verbs per sentence, mean nu...
Improvement of generative adversarial networks for automatic text-to-image generation
This research concerns the use of deep learning tools and image processing technology for the automatic generation of images from text. Previous studies have used a single sentence to produce images. In this research, a memory-based hierarchical model is presented that uses three different descriptions, each given as a sentence, to produce and improve the image. The proposed ...
An Improved Automatic EEG Signal Segmentation Method based on Generalized Likelihood Ratio
It is often necessary to label electroencephalogram (EEG) signals by segments of similar characteristics that are particularly meaningful to clinicians and useful for assessment by neurophysiologists. Within each segment, the signals are considered statistically stationary, usually with similar characteristics such as amplitude and/or frequency. In order to detect the segment boundaries of a signal, we ...